Abstract. Many data dissemination and publish-subscribe systems that 
guarantee the privacy and authenticity of the participants rely on sym- 
metric key cryptography. An important problem in such a system is to 
maintain the shared group key as the group membership changes. We 
consider the problem of determining a key hierarchy that minimizes the 
average communication cost of an update, given update frequencies of 
the group members and an edge-weighted undirected graph that cap- 
tures routing costs. We first present a polynomial-time approximation 
scheme for minimizing the average number of multicast messages needed 
for an update. We next show that when routing costs are considered, 
the problem is NP-hard even when the underlying routing network is a 
tree network or even when every group member has the same update 
frequency. Our main result is a polynomial time constant-factor approx- 
imation algorithm for the general case where the routing network is an 
arbitrary weighted graph and group members have nonuniform update 
frequencies. 



1 Introduction 

A number of data dissemination and publish-subscribe systems, such as interac- 
tive gaming, stock data distribution, and video conferencing, need to guarantee 
the privacy and authenticity of the participants. Many such systems rely on sym- 
metric key cryptography, whereby all legitimate group members share a common 
key, henceforth referred to as the group key, for group communication. An impor- 
tant problem in such a system is to maintain the shared group key as the group 
membership changes. The main security requirement is confidentiality: only valid 
users should have access to the multicast data. In particular this means that any 
user should have access to the data only during the time periods that the user 
is a member of the group. 

There have been several proposals for multicast key distribution for the Internet
and ad hoc wireless networks [2,7,8,18,24]. A simple solution proposed in
early Internet RFCs is to assign each user a user key; when there is a change
in the membership, a new group key is selected and separately unicast to each
of the users using their respective user keys [8,7]. A major drawback of such a
key management scheme is its prohibitively high update cost in scenarios where 
member updates are frequent. 



The focus of this paper is on a natural key management approach that uses 
a hierarchy of auxiliary keys to update the shared group key and maintain the
desired security properties. Variations of this approach, commonly referred to as 
the Key Graph or the Logical Key Hierarchy scheme, were proposed by several 
independent groups of researchers [2,4,21,23,24]. The main idea is to have a
single group key for data communication, and have a group controller (a special 
server) distribute auxiliary subgroup keys to the group members according to 
a key hierarchy. The leaves of the key hierarchy are the group members and 
every node of the tree (including the leaves) has an associated auxiliary key. 
The key associated with the root is the shared group key. Each member stores 
auxiliary keys corresponding to all the nodes in the path to the root in the 
hierarchy. When an update occurs, say at member u, then all the keys along the 
path from u to the root are rekeyed from the bottom up (that is, new auxiliary 
keys are selected for every node on the path). If a key at node v is rekeyed, 
the new key value is multicast to all the members in the subtree rooted at v 
using the keys associated with the children of v in the hierarchy.^ A detailed
example is given in Figure 1. It is not hard to see that the above key hierarchy
approach, suitably implemented, yields an exponential reduction in the number 
of multicast messages needed on a member update, as compared to the scheme 
involving one auxiliary key per user. 

The effectiveness of a particular key hierarchy depends on several factors in- 
cluding the organization of the members in the hierarchy, the routing costs in the 
underlying network that connects the members and the group controller, and the 
frequency with which individual members join or leave the group. Past research 
has focused on either the security properties of the key hierarchy scheme ^ or 
concentrated on minimizing either the total number of auxiliary keys updated or 
the total number of multicast messages [22], not taking into account the routing
costs in the underlying communication network. 

1.1 Our contributions 

In this paper, we consider the problem of designing key hierarchies that minimize 
the average update cost, given an arbitrary underlying routing network and given 
arbitrary update frequencies of the members, which we refer henceforth to as 
weights. Let S denote the set of all group members. For each member v, we are 
given a weight representing the update probability at v (e.g., a join/leave 
action at v). Let G denote an edge-weighted undirected routing network that
connects the group members with a group controller r. The cost of any multicast 
from r to any subset of S is determined by G. The cost of a given key hierarchy 
is then given by the weighted average, over the members v, of the sum of the 
costs of the multicasts performed when an update occurs at v. A formal problem 
definition is given in Section 2.

^ We emphasize here that auxiliary keys in the key hierarchy are only used for main- 
taining the group key. Data communication within the group is conducted using the 
group key. 



• We first consider the objective of minimizing the average number of multicast 
messages needed for an update, which is modeled by a routing tree where the 
multicast cost to every subset of the group is the same. For uniform multicast 
costs, we precisely characterize the optimal hierarchy when all the member 
weights are the same, and present a polynomial-time approximation scheme 
when member weights are nonuniform. These results appear in Section 3.

• We next show in Section 4 that the problem is NP-hard when multicast costs
are nonuniform, even when the underlying routing network is a tree or when 
the member weights are uniform. 

• Our main result is a constant-factor approximation algorithm in the general 
case of nonuniform member weights and nonuniform multicast costs captured 
by an arbitrary routing graph. We achieve a 75-approximation in general, and 
achieve improved constants of approximation for tree networks (11 for nonuniform
weights and 4.2 for uniform weights). These results are in Section 5.

Our approximation algorithms are based on a simple divide-and-conquer 
framework that constructs "balanced" binary hierarchies by partitioning the 
routing graph using both the member weights and the routing costs. A key ingredient
of our result for arbitrary routing graphs is the algorithm of [14] which,
given any weighted graph, finds a spanning tree that simultaneously approxi- 
mates the shortest path tree from a given node and the minimum spanning tree 
of the graph. 

We have formulated the key hierarchy design as a static optimization problem,
capturing the update frequencies as weights instead of explicitly modeling
the time-varying membership of the group. Our formulation is applicable in scenarios
where (a) the communication group is large with frequent updates, yet
the update probability of any individual member is small; or (b) an update at 
a member may occur due to reasons other than change in membership, e.g., if 
the key is compromised, or if each "member" in the problem formulation ac- 
tually represents a collection of members in a local network, one of whom is 
joining/leaving; or (c) the key hierarchy is periodically redesigned by solving the 
static optimization problem. Furthermore, the key hierarchies that we design in 
this paper are simple and may be amenable to maintain efficiently in a dynamic 
setting. We plan to investigate this aspect in future work. 

1.2 Related work 

Variants of the key hierarchy scheme studied in this paper were proposed by several
independent groups [2,4,21,23,24]. The particular model we have adopted
matches the Key Graph scheme, for which it is shown that a balanced hierarchy
achieves an upper bound of $O(\log n)$ on the number of multicast messages
needed for any update in a group of n members. In [22], it is shown that
$\Omega(\log n)$ messages are necessary for an update in the worst case, for a general
class of key distribution schemes. Lower bounds on the amount of communication
needed under constraints on the number of keys stored at a user are given
in [3]. Information-theoretic bounds on the number of auxiliary keys that need
to be updated given member update frequencies are given in [19].



In recent work, [16] and [20] have studied the design of key hierarchy schemes
that take into account the underlying routing costs and energy consumption in
an ad hoc wireless network. The results of [16,20], which consist of hardness
proofs, heuristics, and simulation results, are closely tied to the wireless net- 
work model, relying on the broadcast nature of the medium. In this paper, we 
present approximation algorithms for a more basic routing cost model given by 
an undirected weighted graph. 

The special case of uniform multicast costs (with nonuniform member weights) 
bears a strong resemblance to the Huffman encoding problem [11]. Indeed, it can
be easily seen that an optimal binary hierarchy in this special case is given by 
the Huffman code. The truly optimal hierarchy, however, may contain internal 
nodes of both degree 2 and degree 3, which contribute different costs, respec- 
tively, to the leaves. In this sense, the problem seems related to Huffman coding 
with unequal letter costs [12], for which a PTAS is given in [6]. The optimization
problem that arises when multicast costs and member weights are both
uniform also appears as a special case of the constrained set selection problem, 
formulated in the context of website design optimization [10]. Another related 
problem is broadcast tree scheduling where the goal is to determine a sched- 
ule for broadcasting a message from a source node to all the other nodes in a 
heterogeneous network where different nodes may incur different delays between 
consecutive message transmissions [13,17]. Both the Key Hierarchy Problem and
the Broadcast Tree problem seek a rooted tree in which the cost for a node may 
depend on the degrees of the ancestors; however, the optimization objectives are 
different. 

As mentioned in Section 1.1, our approximation algorithm for the general
key hierarchy problem uses the elegant algorithm of [14] for finding spanning
trees that simultaneously approximate both the minimum spanning tree weight
and the shortest path tree weight (from a given root). Such graph structures,
commonly referred to as shallow-light trees, have been extensively studied (e.g.,

see mn]). 

2 Problem definition 

An instance of the Key Hierarchy Problem is given by the tuple $(S, w, G, c)$,
where S is the set of group members, $w : S \to \mathbb{Z}$ is the weight function (capturing
the update probabilities), $G = (V, E)$ is the underlying communication
network with $V \supseteq S \cup \{r\}$ where r is a distinguished node representing the group
controller, and $c : E \to \mathbb{Z}$ gives the cost of the edges in G.

Fix an instance $(S, w, G, c)$. We define a hierarchy on a set $X \subseteq S$ to be a
rooted tree H whose leaves are the elements of X. For a hierarchy T over X,
the cost of a member $x \in X$ with respect to T is given by

$$\sum_{\text{ancestor } u \text{ of } x} \; \sum_{\text{child } v \text{ of } u} M(T_v) \qquad (1)$$

where $T_v$ is the set of leaves in the subtree of T rooted at v and, for any set
$Y \subseteq S$, $M(Y)$ is the cost of multicasting from the root r to Y in G. The cost



of a hierarchy T over X is then simply the sum of the weighted costs of all 
the members of X with respect to T. The goal of the Key Hierarchy Problem 
is to determine a hierarchy of minimum cost. An example instance of the Key 
Hierarchy Problem, together with the calculation of the cost of a candidate 
hierarchy for the instance, is given in Figure 1. We introduce some notation
Fig. 1. An instance of the Key Hierarchy Problem with 9 group members, connected to
the group controller by a tree given in (a). Suppose the update frequency of every group
member is 1 and the cost of every edge in the routing tree is 1. An update at member
U4 will require the rekeying of keys K5, K2, and K1. Key K5 is rekeyed by unicasting
to members U3, U4 and U5 at a cost of 3 each. Key K2 is rekeyed by multicasting
to {U1, U2} and to {U3, U4, U5} at a cost of 3 and 5, respectively. Finally, key K1 is
rekeyed by multicasting to {U1, U2, U3, U4, U5}, to {U6} and to {U7, U8, U9} at a cost
of 7, 1, and 4, respectively. Thus, the total cost of an update at member U4 is 29. Using
similar calculations, the average cost of an update can be determined to be 219/9.



that is useful for the remainder of the paper. We use OPT(S) to denote the cost
of an optimal hierarchy for S. We extend the notation W to hierarchies and to
sets of members: for any hierarchy T (resp., set X of members), W(T) (resp.,
W(X)) denotes the sum of the weights of the leaves of T (resp., members in
X). Our algorithms often combine a set $\mathcal{T}$ of two or three hierarchies to form
another hierarchy T': combine($\mathcal{T}$) introduces a new root node R, makes the root
of each hierarchy in $\mathcal{T}$ a child of R, and returns the hierarchy rooted at R.

Using the above notation, a more convenient expression for the cost of a hierarchy
T over X is the following reorganization of the summation in Equation 1:

$$\sum_{u \in T} W(T_u) \sum_{\text{child } v \text{ of } u} M(T_v) \qquad (2)$$
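
The agreement between Equations 1 and 2 can be checked numerically. The following is a minimal sketch, not taken from the paper: the routing tree `PARENT`, the weights `WEIGHT`, and the hierarchy `H` are hypothetical, hierarchies are encoded as nested tuples with member names as leaves, and M(Y) is computed as the total cost of the edges on the union of the root-to-member paths, which is the multicast cost when the routing network is a tree.

```python
# Hypothetical routing tree (not from the paper): node -> (parent, edge cost).
# "r" is the group controller; a, b, c, d are the group members.
PARENT = {"x": ("r", 2), "a": ("x", 1), "b": ("x", 1), "c": ("r", 3), "d": ("r", 1)}
WEIGHT = {"a": 1, "b": 2, "c": 1, "d": 3}


def multicast_cost(members):
    """M(Y): total cost of the edges on the union of root-to-member paths."""
    used = set()
    for m in members:
        node = m
        while node != "r" and node not in used:
            used.add(node)          # records the edge (node, parent of node)
            node = PARENT[node][0]
    return sum(PARENT[n][1] for n in used)


def leaves(t):
    """Members at the leaves of hierarchy t (a leaf is a string)."""
    return [t] if isinstance(t, str) else [x for c in t for x in leaves(c)]


def member_cost(t, x):
    """Equation 1: sum over ancestors u of x of the multicasts to u's children."""
    if isinstance(t, str):
        return 0
    here = sum(multicast_cost(leaves(c)) for c in t)
    below = next(member_cost(c, x) for c in t if x in leaves(c))
    return here + below


def hierarchy_cost(t):
    """Equation 2: sum over internal nodes u of W(T_u) times the multicasts to u's children."""
    if isinstance(t, str):
        return 0
    w = sum(WEIGHT[x] for x in leaves(t))
    return w * sum(multicast_cost(leaves(c)) for c in t) + sum(hierarchy_cost(c) for c in t)


H = (("a", "b"), "c", "d")
per_member = sum(WEIGHT[x] * member_cost(H, x) for x in leaves(H))
total = hierarchy_cost(H)
```

On this toy instance both `per_member` (the weighted sum of Equation 1 over all members) and `total` (Equation 2) evaluate to 74.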



3 Uniform multicast cost 



In this section, we consider the special case of the Key Hierarchy problem where
the multicast cost to any subset of group members is the same. Thus, the objective
is to minimize the average number of multicast messages sent for an update.
We note that the number of multicast messages sent for an update at a member
u is simply the sum of the degrees of its ancestors in the hierarchy (as is evident
from Equation 1). We start by establishing a basic structural property of an
optimal hierarchy and a lower bound on the optimum cost.

Lemma 1. For any given member set S with at least two members, there exists 
an optimal hierarchy in which the degree of every internal node is either two or 
three. 

Proof. Let T* be an optimal hierarchy for S. Since any internal node with degree
one can be replaced by its child, yielding a decrease in cost, the degree of every
internal node of T* is at least two. Let, if possible, v be an internal node of T*
with degree $d \ge 4$. We divide its children into two groups $C_1$ and $C_2$, containing
$\lceil d/2 \rceil$ and $\lfloor d/2 \rfloor$ children, respectively. We add two new internal nodes $v_1$ and
$v_2$, make them children of v, and set $v_1$ and $v_2$ to be the parents of the nodes in
$C_1$ and $C_2$, respectively.

We now consider the cost of the new hierarchy. The cost of any member that
does not have v as an ancestor in T* does not change. The cost of a member
that has v as an ancestor in T* decreases by at least $d - \lceil d/2 \rceil - 2 \ge 0$; thus,
this cost is nonincreasing. If $d > 4$, there exists a member whose cost decreases
by at least $d - \lfloor d/2 \rfloor - 2 > 0$, contradicting the optimality of T*. If $d = 4$, then
we have a new hierarchy whose cost is no more than that of T* and has fewer
internal nodes with degree greater than three. Repeating this process until there
are no internal nodes with degree greater than 3 yields the desired claim. □
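
The transformation used in this proof can be sketched in a few lines. This is an illustrative sketch under assumptions not in the paper: hierarchies are nested tuples, every member has unit weight, and the function names `message_cost` and `split_wide_nodes` are hypothetical.

```python
def message_cost(t, paid=0):
    """Total number of multicast messages over all leaves (unit weights):
    each leaf pays the sum of the degrees of its ancestors."""
    if not isinstance(t, tuple):
        return paid
    return sum(message_cost(c, paid + len(t)) for c in t)


def split_wide_nodes(t):
    """Apply the split from the proof of Lemma 1 until every internal
    node has degree at most three."""
    if not isinstance(t, tuple):
        return t
    kids = [split_wide_nodes(c) for c in t]
    if len(kids) >= 4:
        half = (len(kids) + 1) // 2  # ceil(d/2) children go under v1, the rest under v2
        kids = [split_wide_nodes(tuple(kids[:half])),
                split_wide_nodes(tuple(kids[half:]))]
    return tuple(kids)


flat = tuple(range(9))            # a single root with nine leaf children
narrow = split_wide_nodes(flat)   # (((0, 1, 2), (3, 4)), ((5, 6), (7, 8)))
```

For the flat nine-leaf hierarchy the total message count drops from 81 to 57, and no node of `narrow` has degree above three. The result is not claimed to be optimal; the sketch only illustrates that the split never increases any member's cost.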

Lemma 2. For any member set S, we have $\mathrm{OPT}(S) \ge \sum_{v \in S} 3 w_v \log_3(W(S)/w_v)$.

Proof. The proof is by induction on the size of S. The claim is trivially true
for $|S| = 1$. For the induction hypothesis, we assume that the claim is true for
member sets of size less than $m \ge 2$. Consider an optimal hierarchy for S with
$|S| = m \ge 2$. Let the degree of the root be d, and let the member set in the
subtree rooted at the ith child be $S_i$ with $|S_i| = m_i$, $1 \le i \le d$. We place the
following lower bound on OPT(S):

$$\begin{aligned}
\mathrm{OPT}(S) &\ge d\,W(S) + \sum_{1 \le i \le d} \sum_{v \in S_i} 3 w_v \log_3(W(S_i)/w_v) \\
&= d\,W(S) + \sum_{1 \le i \le d} 3 W(S_i) \log_3 W(S_i) - \sum_{v \in S} 3 w_v \log_3 w_v \\
&\ge d\,W(S) + 3 W(S) \log_3(W(S)/d) - \sum_{v \in S} 3 w_v \log_3 w_v \\
&= d\,W(S) - 3 W(S) \log_3 d + \sum_{v \in S} 3 w_v \log_3(W(S)/w_v) \\
&\ge \sum_{v \in S} 3 w_v \log_3(W(S)/w_v).
\end{aligned}$$

(The third step follows from the convexity of $x \log_3 x$, the last step from $d \ge 3 \log_3 d$, $\forall d \ge 1$.) □
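
As a quick numerical illustration of Lemma 2, the bound can be evaluated directly. This is a sketch only; the function name `lemma2_lower_bound` is hypothetical, and floating-point logarithms are used, so comparisons should allow a small tolerance.

```python
import math


def lemma2_lower_bound(weights):
    """Evaluate sum over v of 3 * w_v * log_3(W(S) / w_v) for the given weights."""
    total = sum(weights)
    return sum(3 * w * math.log(total / w, 3) for w in weights)


# Nine unit-weight members: the bound is 3 * 9 * log_3(9) = 54.
bound_uniform = lemma2_lower_bound([1] * 9)

# The inequality d >= 3 * log_3(d) used in the last step, checked for small d.
gaps = [d - 3 * math.log(d, 3) for d in range(1, 100)]
```

A balanced degree-3 hierarchy on nine unit-weight members has every leaf at depth two, so each member pays 3 + 3 = 6 messages and the total is 54, which meets the bound exactly.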



3.1 Structure of an optimal hierarchy for uniform member weights 

When all the members have the same weight, we can easily characterize an 
optimal key hierarchy by recursion. Let n be the number of members. When 
n = 1, the key hierarchy is just a single node tree. When n = 2, the key hierarchy 
is a root with two leaves as children. When n = 3, the key hierarchy is a root with 
three leaves as children. When n > 3, we are going to build this key hierarchy 
recursively. First divide n members into 3 balanced groups, i.e. the size of each 
group is between $\lfloor n/3 \rfloor$ and $\lceil n/3 \rceil$. Then the key hierarchy is a root with 3
children, each of which is the key hierarchy of one of the 3 groups built recursively 
by this procedure. It is easy to verify that the cost of this hierarchy is given by: 



The following theorem is due to [9,10], where this scenario arises as a special
case of the constrained set selection problem. For completeness, we present an 
alternative shorter proof here. 

Theorem 1 ([9,10]). For uniform multicast costs and member weights, the
above key hierarchy is optimal. 

Proof. We prove this by induction on the number of members. Let n be the
number of members. For the base case ($n \le 5$) we can check the optimality by
brute-force. For the inductive step ($n \ge 6$), we first make two observations: optimal
key hierarchies have an optimal substructure property; and f is a convex function
of n.

By Lemma 1, we know there exists an optimal hierarchy in which the degree
of the root is either two or three. We first consider the case where the degree of
the root is two. Since optimal key hierarchies satisfy the optimal substructure
property, the sub-hierarchies rooted at the two children
of the root must be optimal for the number of members in their respective
subtrees. Thus, by the induction hypothesis, the cost of the optimal hierarchy
equals $f(n_1) + f(n - n_1) + 2n$, where $n_1$ is the number of members in the subtree
rooted at one of the children of the root. Since $n \ge 6$, the convexity of f implies
that each subtree has at least 3 members. From the induction hypothesis, it
also follows that the root of each subtree has degree 3. Let the two children of
the root be $u_1$ and $u_2$. Let the children of $u_i$ be $u_{i1}$, $u_{i2}$, and $u_{i3}$, $1 \le i \le 2$.
We transform this hierarchy into another key hierarchy with the same cost by
adding a third child $u_3$ to the root that has as its children $u_{13}$ and $u_{23}$. The cost
of every member in the new hierarchy remains the same as that in the optimal
hierarchy, which means this new hierarchy is also optimal and its root has degree 3.

So we now focus on the case where there exists an optimal hierarchy in
which the root has degree 3. Let the three children of the root have $n_1$, $n_2$, and
$n_3$ members, respectively. It follows that the cost of the optimal hierarchy equals
$f(n_1) + f(n_2) + f(n_3) + 3n$. The convexity of f implies that the preceding cost is
minimized when each of $n_1$, $n_2$, and $n_3$ is either $\lfloor n/3 \rfloor$ or $\lceil n/3 \rceil$. This is precisely
the proposed hierarchy, thus completing the proof of the theorem. □
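
Theorem 1 can be sanity-checked for small n by comparing the recursive balanced construction with an exhaustive search. The sketch below is illustrative and not part of the paper: `balanced_cost` follows the recursive description in this section (unit weights, unit multicast cost), while `optimal_cost` is a hypothetical dynamic program that tries every split of the root into two or three groups, which suffices by Lemma 1.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def balanced_cost(n):
    """Cost of the recursive construction: three groups of size floor(n/3) or ceil(n/3)."""
    if n == 1:
        return 0          # a single node, no messages
    if n == 2:
        return 4          # root with two leaves: each member pays 2
    if n == 3:
        return 9          # root with three leaves: each member pays 3
    sizes = [n // 3 + (1 if i < n % 3 else 0) for i in range(3)]
    return 3 * n + sum(balanced_cost(s) for s in sizes)


@lru_cache(maxsize=None)
def optimal_cost(n):
    """Minimum cost over all hierarchies whose internal nodes have degree 2 or 3."""
    if n == 1:
        return 0
    best = min(2 * n + optimal_cost(a) + optimal_cost(n - a)
               for a in range(1, n // 2 + 1))
    if n >= 3:
        best = min(best, min(3 * n + optimal_cost(a) + optimal_cost(b) + optimal_cost(n - a - b)
                             for a in range(1, n - 1) for b in range(1, n - a)))
    return best


agree = all(balanced_cost(n) == optimal_cost(n) for n in range(1, 41))
```

For example, `balanced_cost(7)` is 38 (groups of sizes 3, 2, 2), whereas the best hierarchy with a degree-2 root costs 39.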



3.2 A polynomial-time approximation scheme for nonuniform 
member weights 

We give a polynomial-time approximation scheme for the Key Hierarchy Problem
when the multicast cost to every subset of the group is identical and the members
have arbitrary weights. Given a positive constant $\varepsilon$, we present a polynomial-time
algorithm that produces a $(1 + O(\varepsilon))$-approximation. We assume that $1/\varepsilon$
is a power of 3; if not, we can replace $\varepsilon$ by a smaller constant that satisfies
this condition. We round the weight of every member up to the nearest power
of $(1 + \varepsilon)$ at the expense of a factor of $1 + \varepsilon$ in approximation. Thus, in the
remainder we assume that every weight is a power of $(1 + \varepsilon)$. Our algorithm
PTAS(S), which takes as input a set S of members with weights, is as follows.

1. Divide S into two sets, a set H of the 3 members with the largest weight
and the set L = S - H.

2. Initialize $\mathcal{L}$ to be the set of hierarchies consisting of one depth-0 hierarchy for
each member of L.

3. Repeat the following step until it can no longer be executed: if $T_1$, $T_2$, and $T_3$
are hierarchies in $\mathcal{L}$ with identical weight, then replace $T_1$, $T_2$, and $T_3$ in $\mathcal{L}$ by
combine($\{T_1, T_2, T_3\}$). (Recall the definition of combine from Section 2.)

4. Repeat the following step until $\mathcal{L}$ has one hierarchy: replace the two hierarchies
$T_1$, $T_2$ with least weight by combine($\{T_1, T_2\}$). Let $T_L$ denote the hierarchy in
$\mathcal{L}$.
5. Compute an optimal hierarchy T* for H. Determine a node in T* that has
weight at most $\varepsilon W(S)$ and height at most $1/\varepsilon$. We note that such a node exists
since every hierarchy with at least £ leaves has a set N of at least $1/\varepsilon$ nodes
at depth at most $1/\varepsilon$ with the property that no node in N is an ancestor of
another. Set the root of $T_L$ as the child of this node. Return T*.
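
Steps 2 to 4 of the algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_light_hierarchy` is hypothetical, weights are assumed to be already rounded so that equal weights compare equal (integers are used in the example), hierarchies are nested tuples, and steps 1 and 5 (choosing H and attaching the result to T*) are omitted.

```python
import heapq
from collections import defaultdict
from itertools import count


def build_light_hierarchy(weights):
    """Steps 2-4 on the light set L. `weights` maps member name -> weight
    and must be non-empty."""
    # Step 2: one depth-0 hierarchy per member, bucketed by exact weight.
    buckets = defaultdict(list)
    for member, w in weights.items():
        buckets[w].append(member)

    # Step 3: while three hierarchies share a weight, combine them under a new root.
    pending = list(buckets)
    while pending:
        w = pending.pop()
        while len(buckets[w]) >= 3:
            trio = tuple(buckets[w].pop() for _ in range(3))
            buckets[3 * w].append(trio)
            pending.append(3 * w)    # the new weight class may now hold three hierarchies

    # Step 4: Huffman-style merging of the two lightest hierarchies.
    tie = count()                    # tie-breaker so the heap never compares hierarchies
    heap = [(w, next(tie), t) for w, ts in buckets.items() for t in ts]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))
    return heap[0][2]


nine = build_light_hierarchy({f"m{i}": 1 for i in range(9)})
small = build_light_hierarchy({"a": 1, "b": 1, "c": 2})
```

With nine unit-weight members, step 3 alone produces a root with three children of three leaves each. With weights 1, 1, 2 no triple exists, so step 4 pairs the two lightest members first and returns `("c", ("a", "b"))`.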

We now analyze the above algorithm. At the end of step 3, the cost of any
hierarchy T in $\mathcal{L}$ is equal to $\sum_{v \in T} 3 w_v \log_3(W(T)/w_v)$. If $\mathcal{L}$ is the hierarchy
set at the end of step 3, then the additional cost incurred in step 4 is at most
$\sum_{T \in \mathcal{L}} 2 W(T) \log_2(W(L)/W(T))$.

Since there are at most two hierarchies in any weight category in $\mathcal{L}$ at the start
of step 4, at least $1 - \varepsilon^2$ of the weight in the hierarchy set is concentrated in the
heaviest $4/\varepsilon^2$ hierarchies of $\mathcal{L}$. Step 4 is essentially the Huffman coding algorithm
and yields an optimal binary hierarchy. Using Lemma 3 of Section 5, we note
that it achieves a 3-approximation. (In fact, one can show using a more careful
argument that it achieves an approximation of $2 \lg 3 / (3 \lg((1 + \sqrt{5})/2)) \approx 1.52$,
but the factor 3 will suffice for our purposes here.) This yields the following
bound on the increase in cost due to step 4:

$$3 \left( \varepsilon^2 W(L) \log_{1+\varepsilon} 3 + (1 - \varepsilon^2) W(L) \log_2(4/\varepsilon^2) \right) \le W(L)/\varepsilon,$$

for $\varepsilon$ sufficiently small. The final step of the algorithm increases the cost by at
most $W(L)/\varepsilon + \varepsilon W(S)$. Thus, the total cost of the final hierarchy is at most



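$$\begin{aligned}
&\mathrm{OPT}(H) + \mathrm{OPT}(L) + W(L)/\varepsilon + W(L)/\varepsilon + \varepsilon W(S) \\
&\quad \le \mathrm{OPT}(H) + \mathrm{OPT}(L) + 2\varepsilon\,\mathrm{OPT}(S) + \varepsilon\,\mathrm{OPT}(S) \\
&\quad \le (1 + 3\varepsilon)\,\mathrm{OPT}(S).
\end{aligned}$$

(The second step holds since $\mathrm{OPT}(S) \ge \sum_{v \in L} 3 w_v \log_3(W(S)/w_v) \ge W(L)/\varepsilon^2$.)

4 Hardness results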

In this section, we present the hardness results for the Key Hierarchy Problem with
nonuniform multicast cost. First we show that the problem is strongly NP-complete
if group members have different weights and the underlying routing
network is a tree. Then we show the problem is also NP-complete if group members
have the same weights and the underlying routing network is a general
graph.



4.1 Weighted key hierarchy problem with routing tree 

Our reduction is from the NP-complete problem 3-Partition, which is defined
as follows [5]. The input consists of a set A of 3m elements, a bound $B \in \mathbb{Z}^+$,
and a size $s(a) \in \mathbb{Z}^+$ for each $a \in A$ such that $B/4 < s(a) < B/2$
and $\sum_{a \in A} s(a) = mB$. The goal of the problem is to determine whether A can
be partitioned into m disjoint sets $A_1, A_2, \ldots, A_m$ such that for $1 \le i \le m$,
$\sum_{a \in A_i} s(a) = B$.

Theorem 2. When group members have different weights and the routing net- 
work is a tree, the Key Hierarchy Problem is NP-complete. 

Proof. The membership in NP is immediate. We reduce 3-Partition to the Key
Hierarchy Problem. Let P denote the given 3-Partition instance. If the number
3m of elements in P is not a power of three, then we add new elements in
groups of three with sizes B, 0, and 0, respectively, to make the total number of 
elements a power of 3. It is easy to verify that the original problem instance has 
the desired partition if and only if the new instance has the desired partition. 
Thus, for the remainder of the proof, we assume that the number of elements, 
3m, in P is a power of 3. 

In P, let set A be $\{a_1, a_2, \ldots, a_{3m}\}$, and let the size of element $a_i$ be $w'_i$.
We create a routing tree T consisting of a root r connected to a single internal
node u, which in turn has edges to 3m leaves $v_i$ for $i = 1, 2, \ldots, 3m$, one for
each of the 3m members. Root r is the group controller. For member i, we set
its weight $w_i$ to be $w + w'_i$, where w is chosen such that

$$\frac{w_{\max}}{w_{\min}} < \frac{3 \cdot 3m \log_3 3m + 1}{3 \cdot 3m \log_3 3m},$$

where $w_{\max} = \max_i \{w_i\}$ and $w_{\min} = \min_i \{w_i\}$. We set the cost of edge (r, u)
to be C, a constant which will be specified later, and the cost of $(u, v_i)$ to
be $w_i$ for $i = 1, 2, \ldots, 3m$, and the weight of leaf $v_i$ to be $w_i$. We now show
that P has a partition if and only if the optimal key hierarchy of T has cost
$C \cdot 3W \log_3 3m + W^2 \cdot (1 + 1/3 + 1/9 + \cdots + 1/m)$, where W is the sum of the
weights of all the members.

If we set $C > \log_3 3m$, then the cost of an optimal key hierarchy is smaller
than $C \cdot 3 \cdot 3m \log_3 3m \cdot w_{\max}$, which is the optimal cost for 3m members, each
with weight $w_{\max}$. In an optimal key hierarchy, every internal node has degree
3, since otherwise its cost is at least $C \cdot (3 \cdot 3m \log_3 3m + 1) \cdot w_{\min}$, which is not
optimal given that $w_{\max}/w_{\min} < (3 \cdot 3m \log_3 3m + 1)/(3 \cdot 3m \log_3 3m)$. So a balanced
degree-3 tree is the only optimal key hierarchy in this case. In such a hierarchy, the cost contributed
by edge (r, u) is exactly $C \cdot 3W \log_3 3m$. Let $C_i$ denote the set of nodes at depth
i in the hierarchy, the depth of the root being set to 0. By Equation 2, the cost
contributed by edges $(u, v_i)$, $i = 1, 2, \ldots, 3m$, equals

$$\sum_{0 \le i \le \log_3 m} \sum_{x \in C_i} W(T_x) \cdot (M(T_x) - C) = \sum_{0 \le i \le \log_3 m} \sum_{x \in C_i} W(T_x)^2 \ge W^2 + 3(W/3)^2 + 9(W/9)^2 + \cdots + m(W/m)^2.$$

In the last step, equality only holds when $W(T_x) = W/3^i$ for all $x \in C_i$ (by
Jensen's inequality). Thus, the 3-Partition problem has a solution if and only if
the optimal key hierarchy achieves its minimum, which is $C \cdot 3W \log_3 3m + W^2 \cdot (1 + 1/3 + 1/9 + \cdots + 1/m)$. □

4.2 Unweighted key hierarchy problem 

Our reduction is from the NP-complete 3D-Matching problem, which is defined
as follows [5]. We are given finite disjoint sets W, U, V of size q, and a set of triples
$M \subseteq W \times U \times V$. The goal is to determine whether there are q pairwise disjoint
triples.

Theorem 3. When group members have the same key update weights and the 
routing network is a general graph, the Key Hierarchy Problem is NP-complete. 

Proof. We reduce 3D-Matching to the Key Hierarchy problem. Let I be a
given instance of 3D-Matching. If the set size q is not a power of 3 and q'
is the smallest power of 3 larger than q, then we construct a new instance of
3D-Matching by adding q' - q new elements to each of W, U, and V as follows:
for $1 \le i \le q' - q$, add $w'_i$ to W, $u'_i$ to U, $v'_i$ to V, and $(w'_i, u'_i, v'_i)$ to M. It is
easy to see that the original 3D-Matching instance has a solution if and only if
this new 3D-Matching instance has a solution. So from now on we can assume
that q is a power of 3.

For the given instance I, we construct a routing graph as follows. Create vertices
$w_1, w_2, \ldots, w_q$ to represent each element in set W, $u_1, u_2, \ldots, u_q$ to represent
each element in set U, and $v_1, v_2, \ldots, v_q$ to represent each element in set V.
Then create |M| vertices $t_1, t_2, \ldots, t_{|M|}$, and for each element $m_i = (w_x, u_y, v_z) \in M$,
add edges $(t_i, w_x)$, $(t_i, u_y)$, $(t_i, v_z)$ of unit cost to the routing graph. Create
another vertex s, and add edges $(s, t_i)$ for $i = 1, 2, \ldots, |M|$ of unit cost. Finally,
create vertex r, and add an edge (r, s) with cost c. Vertex r is the group controller,
and $W \cup U \cup V$ is the set of group members.
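
The construction of the routing graph is mechanical and can be sketched directly. In this illustration the function name `routing_graph_from_3dm`, the vertex labels `"r"`, `"s"`, `"t1"`, ..., and the edge-dictionary encoding are assumptions made for the example; the cost c of edge (r, s) is left as a parameter, since its required magnitude is fixed by the argument that follows.

```python
def routing_graph_from_3dm(W, U, V, M, c):
    """Build the reduction's routing graph as {(endpoint, endpoint): cost}
    together with the set of group members.

    W, U, V: disjoint collections of element names; M: list of triples (w, u, v).
    """
    edges = {("r", "s"): c}              # controller r hangs off s by one edge of cost c
    for i, (w, u, v) in enumerate(M, start=1):
        t = f"t{i}"                      # one vertex t_i per triple m_i
        edges[("s", t)] = 1
        for element in (w, u, v):
            edges[(t, element)] = 1      # unit-cost edges to the triple's three elements
    members = set(W) | set(U) | set(V)
    return edges, members


edges, members = routing_graph_from_3dm(
    W=["w1", "w2"], U=["u1", "u2"], V=["v1", "v2"],
    M=[("w1", "u1", "v1"), ("w2", "u2", "v2"), ("w1", "u2", "v1")],
    c=1000,
)
```

The example instance has q = 2 and three triples, giving one edge (r, s), three edges out of s, and nine unit edges to elements, with six group members in total.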

If we set c to be greater than $(|M| + 3q) \cdot 3 \cdot 3q \log_3 3q$, then using an argument
similar to the proof of Theorem 2, we can show that the optimal key hierarchy
is a balanced degree-3 tree. We will next argue that there is a matching in I if
and only if the cost of the optimal key hierarchy is $c \cdot 3 \cdot 3q \log_3 3q + 6q(3q - 1)$.

We now calculate the cost of the optimal hierarchy using Equation 2. The
cost contributed by edge (r, s) is exactly $c \cdot 3 \cdot 3q \log_3 3q$. The cost contributed by
edges $(t_i, w_x)$, $(t_i, u_y)$ and $(t_i, v_z)$, where $i = 1, 2, \ldots, |M|$ and $x, y, z = 1, 2, \ldots, q$,
is $\frac{9}{2} q(3q - 1)$. The cost contributed by edges $(s, t_i)$, $i = 1, 2, \ldots, |M|$, is at least
$\frac{3}{2} q(3q - 1)$. This minimum is achieved only if there is a 3D-Matching. So there
is a solution to the 3D-Matching problem if and only if the cost of the optimal
key hierarchy is $c \cdot 3 \cdot 3q \log_3 3q + 6q(3q - 1)$. This completes the proof of the
theorem. □



5 Approximation algorithms for nonuniform multicast 
costs 

In this section, we present constant-factor approximation algorithms for the
Key Hierarchy Problem with nonuniform multicast costs. We first show that
for any instance, there always exists a binary hierarchy that is 3-approximate.
This guides the design of our approximation algorithms. We next present, in
Section 5.1, an 11-approximation algorithm for the case where the underlying
communication network is a tree. Finally, we present, in Section 5.2, a 75-approximation
algorithm for the most general case of our problem, where the
communication network is an arbitrary weighted graph.

Lemma 3. For any instance, there exists a 3-approximate binary hierarchy. 

Proof. Consider any optimal hierarchy T. Following Equation 2, we associate
with each node u of T a cost equal to $W(T_u) \sum_{\text{child } v \text{ of } u} M(T_v)$; we refer to this
cost as nc(u). We show how to transform T to a binary hierarchy by repeatedly
replacing a node, say u, with degree $d > 2$, by a node u' of degree two and a set
U of at most two other nodes, each with degree strictly less than d. To argue
the bound on the cost of the binary hierarchy, we use a charging argument: in
particular, we show that $3\,nc(u) \ge nc(u') + \sum_{v \in U} 3\,nc(v)$.

Consider any node u of T of degree greater than two. We consider two cases.
The first case is where there is no child of u that has weight at least one-third of
the weight under u. We divide the children of u into two groups such that each
group has at least one-third of weight under u. If such a partition exists, then
we replace u by three nodes: $u'$, $u_1$, and $u_2$. The parent of node $u'$ is the same
as the parent of u (if it exists). The node $u'$ is the parent for both $u_1$ and $u_2$.
Finally, $u_1$ and $u_2$ are the parents of the children of u in the two groups of the
partition, respectively.



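$$\begin{aligned}
3\,nc(u) &= 3 W(T_u) \sum_{\text{child } v \text{ of } u} M(T_v) \\
&\ge W(T_u) \left( M(T_{u_1}) + M(T_{u_2}) \right) + 2 W(T_u) \sum_{\text{child } v \text{ of } u} M(T_v) \\
&= nc(u') + 2 W(T_u) \sum_{\text{child } v \text{ of } u_1} M(T_v) + 2 W(T_u) \sum_{\text{child } v \text{ of } u_2} M(T_v) \\
&\ge nc(u') + 3\,nc(u_1) + 3\,nc(u_2).
\end{aligned}$$

(The second step uses $M(T_{u_i}) \le \sum_{\text{child } v \text{ of } u_i} M(T_v)$, and the last step uses $W(T_{u_i}) \le \frac{2}{3} W(T_u)$.)

The second case is where u has a child $u_1$ with weight at least two-thirds of the
total weight under u. In this case, we replace u by two nodes $u'$ and $u_2$, with
$u'$ becoming the parent of $u_1$ and $u_2$, and $u_2$ becoming the parent of the other
children of u. The parent of $u'$ is the same as that of u (if it exists). Using a similar
argument as above, we obtain that 3 nc(u) equals $3 W(T_u) \sum_{\text{child } v \text{ of } u} M(T_v)$,
which is at least $nc(u') + 3\,nc(u_2)$. □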
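
The transformation in this proof can be tried out on a small instance. The sketch below is one possible reading of the two cases, not the authors' code: a child counts as heavy when it holds more than two-thirds of the weight, a group consisting of a single child reuses that child instead of adding a degree-one node, and the instance (`EDGE`, `SHARED`, `WEIGHT`, a star-shaped routing tree with a shared first hop) is hypothetical.

```python
# Hypothetical instance: r --SHARED--> hub, then one edge from the hub to each member.
EDGE = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
SHARED = 2
WEIGHT = {"a": 5, "b": 1, "c": 1, "d": 1, "e": 1}


def leaves(t):
    return [t] if isinstance(t, str) else [x for c in t for x in leaves(c)]


def M(members):
    """Multicast cost on the star-shaped routing tree."""
    return SHARED + sum(EDGE[m] for m in members)


def W(t):
    return sum(WEIGHT[x] for x in leaves(t))


def cost(t):
    """Equation 2."""
    if isinstance(t, str):
        return 0
    return W(t) * sum(M(leaves(c)) for c in t) + sum(cost(c) for c in t)


def wrap(group):
    """A group of one child needs no extra internal node."""
    return group[0] if len(group) == 1 else tuple(group)


def binarize(t):
    if isinstance(t, str):
        return t
    kids = list(t)
    if len(kids) <= 2:
        return tuple(binarize(k) for k in kids)
    total, heavy = W(t), max(kids, key=W)
    if 3 * W(heavy) >= total:
        g1 = [heavy]                 # heavy child alone (covers the second case too)
    else:
        g1 = []                      # all children are light: accumulate to one-third
        for k in kids:
            g1.append(k)
            if 3 * sum(map(W, g1)) >= total:
                break
    g2 = [k for k in kids if all(k is not g for g in g1)]
    return (binarize(wrap(g1)), binarize(wrap(g2)))


flat = ("a", "b", "c", "d", "e")
binary = binarize(flat)              # ("a", (("b", "c"), ("d", "e")))
```

On this instance the flat hierarchy costs 225 and the binary one costs 287, comfortably within the factor of 3 that the charging argument guarantees for the transformation.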

5.1 Approximation algorithms for routing trees 

In this section, we first give an 11-approximation algorithm for the case where 
weights are nonuniform and the routing network is a tree. Then we analyze the 
more special case with uniform weights, and improve the approximation factor 
to 4.2. 

Given any routing tree, let S be the set of members. We start by defining
a procedure partition(·) that takes as input the set S and returns a pair (X, v),
where X is a subset of S and v is a node in the routing tree. First, we determine
whether there is an internal node v that has a subset C of children such that the total
weight of the members in the subtrees of the routing tree rooted at the nodes
in C is between W(S)/3 and 2W(S)/3. If v exists, then we partition S into two
parts: X, which is the set of members in the subtrees rooted at the nodes in C,
and S \ X. It follows that $W(S)/3 \le W(X) \le 2W(S)/3$. If v does not exist, then
it is easy to see that there is a single member with weight more than 2W(S)/3.
In this case, we set X to be the singleton set containing this heavy member,
which we call v. The procedure partition(S) returns the pair (X, v). In the
remainder, we let Y denote S \ X.
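
One way to realize the partition procedure is sketched below. It is a hedged illustration rather than the authors' implementation: it assumes that members sit at the leaves of the routing tree, that the tree is given as a dictionary `children`, and that weights are positive integers; the descent rule and all names are assumptions. Subtree weights are recomputed on each call for brevity.

```python
def partition(children, weight, root):
    """Return (X, v) in the spirit of the partition procedure.

    children: dict mapping each routing-tree node to a list of its children.
    weight:   dict mapping each member (assumed to be a leaf) to its weight.
    """
    def leaves(u):
        kids = children.get(u, [])
        if not kids:
            return [u] if u in weight else []
        return [x for c in kids for x in leaves(c)]

    def w(u):
        return sum(weight[x] for x in leaves(u))

    total = w(root)
    v = root
    while True:
        kids = children.get(v, [])
        if not kids:
            # v is a single member holding more than 2/3 of the total weight
            return [v], v
        heavy = [c for c in kids if 3 * w(c) > 2 * total]
        if heavy:
            v = heavy[0]  # descend into the unique overweight subtree
            continue
        # Every child subtree weighs at most 2W/3: add children, heaviest
        # first, until the chosen set C reaches W/3.
        chosen, acc = [], 0
        for c in sorted(kids, key=w, reverse=True):
            if 3 * acc >= total:
                break
            chosen.append(c)
            acc += w(c)
        return [x for c in chosen for x in leaves(c)], v
```

When the loop stops at an internal node, the chosen set has weight between W(S)/3 and 2W(S)/3; when it reaches a leaf, that leaf is the heavy member.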

ApproxTree(S)

1. If S is a singleton set, then return the trivial hierarchy with a single node.

2. (X, v) = partition(S); let Y denote S \ X.

3. Let $\Delta$ be the cost from the root to the partition node v. If $\Delta \le M(S)/5$, then let
T_1 = ApproxTree(X); otherwise T_1 = PTAS(X). (PTAS is the algorithm
introduced in Section 3.2.)

4. T_2 = ApproxTree(Y).

5. Return combine(T_1, T_2).
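
To experiment with the hierarchies that combine(T_1, T_2) produces, it helps to be able to evaluate their cost. The sketch below assumes the cost model used in the analysis, in which every internal node u contributes W(T_u) times the sum of the multicast costs to the member sets of its children. The nested-tuple representation, the function name, and the `multicast` callback are hypothetical choices made for illustration.

```python
def hierarchy_cost(tree, weight, multicast):
    """Return (cost, members) of a key hierarchy.

    tree:      a member id (leaf, must not be a tuple) or a tuple of sub-hierarchies.
    weight:    dict of member update weights.
    multicast: function mapping a frozenset of members to the cost M(.).
    """
    if not isinstance(tree, tuple):
        return 0, frozenset([tree])
    cost, members, child_sets = 0, frozenset(), []
    for child in tree:
        c, s = hierarchy_cost(child, weight, multicast)
        cost += c
        members |= s
        child_sets.append(s)
    w = sum(weight[m] for m in members)
    # Node cost: W(T_u) times the multicast costs to each child's members.
    cost += w * sum(multicast(s) for s in child_sets)
    return cost, members
```

With three unit-weight members and unit multicast cost, the hierarchy `(("a", "b"), "c")` costs 10 under this model, while the flat hierarchy `("a", "b", "c")` costs 9.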



Theorem 4. Algorithm ApproxTree is an $(11 + \epsilon)$-approximation, where $\epsilon > 0$
can be made arbitrarily small.

Proof. Let ALG(S) be the key hierarchy constructed by our algorithm and OPT(S)
be the optimal key hierarchy. In the following proof, we abuse notation and
use ALG(·) and OPT(·) to refer to both the key hierarchies and their costs. We
first note that $\mathrm{OPT}(S) \ge \mathrm{OPT}(X) + \mathrm{OPT}(Y)$.

We prove by induction on the number of members in S that $\mathrm{ALG}(S) \le
\alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)$, for constants $\alpha$ and $\beta$ specified later. The induction
base, when $|S| \le 2$, is trivial. For the induction step, we consider three
cases depending on the distance to the partition node and whether we obtain
a balanced partition; we say that a partition (X, Y) is balanced if $\frac{1}{3}W(S) \le
W(X), W(Y) \le \frac{2}{3}W(S)$. The first case is where $\Delta \le M(S)/5$ and the partition
is balanced. In this case, we have

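\[
\begin{aligned}
\mathrm{ALG}(S) &= \mathrm{ALG}(X) + \mathrm{ALG}(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(X) + \beta \cdot W(X)M(X) + \alpha \cdot \mathrm{OPT}(Y) + \beta \cdot W(Y)M(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha\,[\mathrm{OPT}(X) + \mathrm{OPT}(Y)] + \left(\tfrac{2}{3}\beta + 1\right) W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{2}{3}\beta + 1\right) W(S)\,[M(S) + \Delta] \\
&\le \alpha \cdot \mathrm{OPT}(S) + \tfrac{6}{5}\left(\tfrac{2}{3}\beta + 1\right) W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)
\end{aligned}
\]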

as long as $\frac{6}{5}\left(\frac{2}{3}\beta + 1\right) \le \beta$, which is true if $\beta \ge 6$. The second case is where $\Delta >
M(S)/5$ and the partition is balanced. In this case, we only call the algorithm
recursively on Y and use the PTAS on X.

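\[
\begin{aligned}
\mathrm{ALG}(S) &= \mathrm{PTAS}(X) + \mathrm{ALG}(Y) + W(S)\,[M(X) + M(Y)] \\
&\le 5(1 + \epsilon) \cdot \mathrm{OPT}(X) + \alpha \cdot \mathrm{OPT}(Y) + \beta \cdot W(Y)M(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha\,[\mathrm{OPT}(X) + \mathrm{OPT}(Y)] + \tfrac{2}{3}\beta\, W(S)M(S) + 2W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{2}{3}\beta + 2\right) W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)
\end{aligned}
\]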

as long as $\alpha \ge 5(1 + \epsilon)$ and $\frac{2}{3}\beta + 2 \le \beta$, which is true if $\beta \ge 6$. The third case
is when the partition is not balanced (i.e., $W(X) > \frac{2}{3}W(S)$). In this case, our
algorithm connects the heavy node directly to the root of the hierarchy.



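\[
\begin{aligned}
\mathrm{ALG}(S) &= \mathrm{ALG}(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(Y) + \beta \cdot W(Y)M(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(S) + \tfrac{1}{3}\beta\, W(S)M(S) + 2W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{1}{3}\beta + 2\right) W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)
\end{aligned}
\]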

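as long as $\frac{1}{3}\beta + 2 \le \beta$, which is true if $\beta \ge 3$. So, by induction, we have shown
$\mathrm{ALG}(S) \le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)$ for $\alpha \ge 5(1 + \epsilon)$ and $\beta \ge 6$. Since
$\mathrm{OPT}(S) \ge W(S)M(S)$, we obtain an $(11 + \epsilon)$-approximation. □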
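
The three constraints on $\beta$ collected in the proof can be checked with exact arithmetic. The sketch below is only a sanity check of that arithmetic; the case labels and the helper `smallest_beta` are hypothetical names, and each constraint is treated as an affine condition of the form $f(\beta) \le \beta$.

```python
from fractions import Fraction as F

# Each case of the induction needs f(beta) <= beta for an affine f.
cases = {
    "balanced, small Delta": lambda b: F(6, 5) * (F(2, 3) * b + 1),
    "balanced, large Delta": lambda b: F(2, 3) * b + 2,
    "unbalanced": lambda b: F(1, 3) * b + 2,
}

def smallest_beta(f):
    """Solve f(beta) = beta for an affine f(beta) = slope * beta + offset."""
    offset = f(F(0))
    slope = f(F(1)) - offset
    return offset / (1 - slope)

thresholds = {name: smallest_beta(f) for name, f in cases.items()}
beta = max(thresholds.values())
```

The thresholds come out as 6, 6, and 3, so $\beta = 6$ satisfies all three cases; together with $\alpha = 5(1 + \epsilon)$ this gives $\alpha + \beta = 11 + 5\epsilon$, which is the stated bound after rescaling $\epsilon$.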

If the member weights are uniform, then we can improve the approximation ratio 
to 4.2 using a more careful analysis of the same algorithm. We refer the reader 
to the appendix for details. 

5.2 Approximation algorithms for routing graphs 

In this section, we give a constant-factor approximation algorithm for the case
where weights are nonuniform and the routing network is an arbitrary graph. In
our algorithm, we compute light approximate shortest-path trees (LAST) [14] of
subgraphs of the routing graph. An $(\alpha, \beta)$-LAST of a given weighted graph G is
a spanning tree T of G such that the shortest path in T from a specified root
to any vertex is at most $\alpha$ times the shortest path from the root to the vertex in
G, and the total weight of T is at most $\beta$ times the weight of the minimum spanning tree of G.
For any $\gamma > 0$, the algorithm of [14] yields an $(\alpha, \beta)$-LAST with $\alpha = 1 + \sqrt{2}\gamma$
and $\beta = 1 + \sqrt{2}/\gamma$, where $\gamma$ can be chosen as an input parameter.
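
The trade-off controlled by $\gamma$ can be tabulated with a small helper. This is a sketch of the two formulas above only; it does not construct a LAST, and the function name `last_parameters` is hypothetical.

```python
import math

def last_parameters(gamma):
    """Return (path_stretch, weight_factor) = (1 + sqrt(2)*gamma, 1 + sqrt(2)/gamma)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return 1 + math.sqrt(2) * gamma, 1 + math.sqrt(2) / gamma
```

For example, `last_parameters(math.sqrt(2))` is approximately (3, 2). A larger $\gamma$ makes the tree lighter at the price of longer root-to-vertex paths, and the product of the two excesses over 1 is always 2.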

ApproxGraph(S)

1. If S is a singleton set, return the trivial hierarchy with one node.

2. Compute the complete graph on $S \cup \{\text{root}\}$. The weight of an edge (u, v) is
the length of the shortest path between u and v in the original routing graph.

3. Compute the minimum spanning tree on this complete graph. Call it MST(S).

4. Compute an $(\alpha, \beta)$-LAST L of MST(S).

5. (X, v) = partition(L).

6. Let $\Delta$ be the cost from the root to the partition node v. If $\Delta \le M(S)/5$, then let
T_1 = ApproxGraph(X). Otherwise, T_1 = PTAS(X).

7. T_2 = ApproxGraph(Y).

8. Return combine(T_1, T_2).

The optimum multicast to a member set is obtained by a minimum Steiner
tree, computing which is NP-hard. It is well known that the minimum Steiner
tree is 2-approximated by a minimum spanning tree (MST) in the metric space
connecting the root to the desired members (the metric being the shortest-path
cost in the routing graph). So at the cost of a factor 2 in the approximation, we
define M(S) to be the cost of the MST connecting the root to S in the complete
graph G(S) whose vertex set is $S \cup \{\text{root}\}$ and in which the weight of edge (u, v) is the
shortest-path distance between u and v in the routing graph.
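
The quantity M(S) defined above can be computed as sketched below, using Dijkstra's algorithm for the shortest-path metric and Prim's algorithm for the MST. This is a minimal sketch under assumptions: the routing graph is a connected, undirected graph stored as a dictionary of adjacency dictionaries with nonnegative weights, node labels are mutually comparable, and the function names are hypothetical.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra on graph: dict node -> dict neighbor -> nonnegative weight."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def mst_multicast_cost(graph, root, members):
    """Weight of the MST of the shortest-path metric on members plus root (Prim)."""
    terminals = [root] + [m for m in members if m != root]
    dist = {t: shortest_paths(graph, t) for t in terminals}
    # best[t] is the cheapest connection of terminal t to the tree built so far
    best = {t: dist[root][t] for t in terminals if t != root}
    total = 0
    while best:
        t = min(best, key=best.get)
        total += best.pop(t)
        for u in best:
            best[u] = min(best[u], dist[t][u])
    return total
```

On a star with unit edges from a center to the root and to two members, this returns 4, whereas the minimum Steiner tree (the star itself) costs 3, which is within the factor of 2 mentioned above.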

Theorem 5. The algorithm ApproxGraph is a constant-factor approximation. 

Proof. We prove by induction on the number of members in S that $\mathrm{ALG}(S) \le
\alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)$, for constants $\alpha$ and $\beta$ specified later. The induction
base, when $|S| \le 2$, is trivial. For the induction step, we consider three cases.
The first case is $\Delta \le M(S)/5$ and the partition is balanced (as defined in the
proof of Theorem 4). Let $M_L(S)$ be the multicast cost to S in the LAST. From
the description of LAST we know $M_L(S) \le (1 + \sqrt{2}/\gamma) \cdot M(S)$. Also we have
$M_L(S) \ge M_L(X) + M_L(Y) - \Delta \ge M(X) + M(Y) - \Delta$. So $(1 + \sqrt{2}/\gamma) \cdot M(S) \ge
M(X) + M(Y) - \Delta$.

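\[
\begin{aligned}
\mathrm{ALG}(S) &= \mathrm{ALG}(X) + \mathrm{ALG}(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(X) + \beta \cdot W(X)M(X) + \alpha \cdot \mathrm{OPT}(Y) + \beta \cdot W(Y)M(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha\,[\mathrm{OPT}(X) + \mathrm{OPT}(Y)] + \left(\tfrac{2}{3}\beta + 1\right) W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{2}{3}\beta + 1\right) W(S)\left[\left(1 + \sqrt{2}/\gamma\right) M(S) + \Delta\right] \\
&\le \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{6}{5} + \sqrt{2}/\gamma\right)\left(\tfrac{2}{3}\beta + 1\right) W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)
\end{aligned}
\]

as long as $\left(\frac{6}{5} + \sqrt{2}/\gamma\right)\left(\frac{2}{3}\beta + 1\right) \le \beta$.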

The second case is $\Delta > M(S)/5$ and the partition is balanced. In this case,
we only call the algorithm recursively on Y and use the PTAS for X. Since
$\Delta > M(S)/5$, the distance from the root to any element in X is at least
$\frac{\Delta}{1 + \sqrt{2}\gamma} > \frac{M(S)}{5(1 + \sqrt{2}\gamma)}$. So the multicast cost to any subset of X is between
$\frac{M(S)}{5(1 + \sqrt{2}\gamma)}$ and $M(S)$.
By using the PTAS, we have a $5(1 + \epsilon)(1 + \sqrt{2}\gamma)$-approximation on OPT(X).
So we have the following bound on ALG(S).

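\[
\begin{aligned}
\mathrm{ALG}(S) &= \mathrm{PTAS}(X) + \mathrm{ALG}(Y) + W(S)\,[M(X) + M(Y)] \\
&\le 5\left(1 + \sqrt{2}\gamma\right)(1 + \epsilon) \cdot \mathrm{OPT}(X) + \alpha \cdot \mathrm{OPT}(Y) + \beta \cdot W(Y)M(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha\,[\mathrm{OPT}(X) + \mathrm{OPT}(Y)] + \tfrac{2}{3}\beta\, W(S)M(S) + 2W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{2}{3}\beta + 2\right) W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)
\end{aligned}
\]

as long as $\alpha \ge 5\left(1 + \sqrt{2}\gamma\right)(1 + \epsilon)$ and $\beta \ge 6$.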

The third case is when the partition is not balanced. In this case, our
algorithm connects the heavy node directly to the root of the key hierarchy. So we have
the following bound on ALG(S).

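\[
\begin{aligned}
\mathrm{ALG}(S) &= \mathrm{ALG}(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(Y) + \beta \cdot W(Y)M(Y) + W(S)\,[M(X) + M(Y)] \\
&\le \alpha \cdot \mathrm{OPT}(S) + \tfrac{1}{3}\beta\, W(S)M(S) + 2W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{1}{3}\beta + 2\right) W(S)M(S) \\
&\le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)
\end{aligned}
\]

as long as $\beta \ge 3$. So, this algorithm has a constant approximation.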

So, by induction, we have shown $\mathrm{ALG}(S) \le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot W(S)M(S)$,
implying an $(\alpha + \beta)$-approximation. When $\gamma = 7$, from the constraints, we obtain
$\alpha \ge 54$ and $\beta \ge 21$. So we have a 75-approximation. □
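
The constants for a given $\gamma$ can be evaluated numerically. The sketch below is a hedged check, not part of the proof: it assumes that the first-case constraint reads $(6/5 + \sqrt{2}/\gamma)(2\beta/3 + 1) \le \beta$, that the other two cases require $\beta \ge 6$ and $\beta \ge 3$, and that $\alpha \ge 5(1 + \sqrt{2}\gamma)(1 + \epsilon)$; the function name is hypothetical.

```python
import math

def graph_constants(gamma, eps=0.0):
    """Smallest alpha and beta allowed by the three cases, for a given gamma."""
    stretch = 1 + math.sqrt(2) * gamma      # LAST path stretch
    c = 6 / 5 + math.sqrt(2) / gamma        # factor in the first case
    if 2 * c / 3 >= 1:
        raise ValueError("first-case constraint has no solution for this gamma")
    # c * (2*beta/3 + 1) <= beta  is equivalent to  beta >= c / (1 - 2c/3)
    beta = max(c / (1 - 2 * c / 3), 6.0, 3.0)
    alpha = 5 * stretch * (1 + eps)
    return alpha, beta
```

For $\gamma = 7$ and $\epsilon = 0$ this evaluates to roughly 54.5 and 21.5, close to the constants quoted above once they are rounded down. The first-case constraint has a solution only when $\gamma$ is large enough that $\frac{2}{3}(6/5 + \sqrt{2}/\gamma) < 1$.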

6 Discussion 

We have presented a constant-factor approximation algorithm for the Key Hi- 
erarchy Problem for the general case where the member weights are nonuni- 
form and the communication network is an arbitrary graph. While we do ob- 
tain improved approximation factors when the communication network is a tree, 
the factors achieved are large and need to be improved. We have also given 
a polynomial-time approximation scheme for the problem instance where all 
multicasts cost the same. We do not know, however, whether this problem is 
NP-complete. As discussed in Section 1.2, the problem is related to the classic
Huffman coding problem with nonuniform letter costs, whose complexity (P vs 
NP-hardness) is also not yet resolved. 

There are several other directions for future research. We are currently ex- 
ploring the dynamic maintenance of our key hierarchies, explicitly modeling the 
joining and leaving of members, while maintaining the constant-factor approxi- 
mation in cost. We would also like to study the design of key hierarchies where 
the members have a bound on the number of auxiliary keys they store. Also of 
interest is the case where we have no (or limited) information on the update 
frequencies of the members. 
A Improved approximation for the routing tree case
when weights are uniform



For the case where group members have the same key update probability and
the communication network is a tree, we can show that the same algorithm achieves
an approximation ratio of 4.2 by a different analysis, as follows.

Claim. A balanced partition node can always be found if the members have the
same key update weight.

Proof. Suppose that no such partition node exists, which means that for every
internal node v, its number of leaves is either less than n/3 or more than 2n/3. We call nodes
with fewer than n/3 leaves small nodes, and nodes with more than 2n/3 leaves
large nodes. Consider a large node that has only small nodes as its children. There
must be a combination of its children whose total number of leaves is between
n/3 and 2n/3, so a balanced partition node exists, a contradiction. □

Lemma 4. $\mathrm{ALG}(S) \le \mathrm{ALG}(X) + \mathrm{ALG}(Y) + \frac{4\Delta}{3\log(3/2)}\, n \log n + n\,(M(X) + M(Y))$.

Proof. (1) The cost of the nodes in ALG(Y) is the same as their cost in ALG(S).
(2) Similarly, the cost in ALG(S) of the nodes in ALG(X) is at most their cost in ALG(X) plus
$\frac{4\Delta}{3\log(3/2)}\, n \log n$. The reason we add $\frac{4\Delta}{3\log(3/2)}\, n \log n$ is that the multicast cost of each
node in ALG(X) increases by $\Delta$ in ALG(S) compared to its cost in ALG(X). Since in the
worst case ALG(X) has $\log_{3/2} n$ levels, the increased cost is at most

\[
2|X|\,\Delta \log_{3/2} n \;\le\; 2 \cdot \tfrac{2n}{3}\,\Delta \log_{3/2} n \;\le\; \frac{4\Delta}{3\log(3/2)}\, n \log n.
\]

Combining (1) and (2), and then adding the cost of the root that joins
ALG(X) and ALG(Y), we obtain the lemma. □



Lemma 5. $\mathrm{OPT}(S) \ge \mathrm{OPT}(X) + \mathrm{OPT}(Y) + \frac{3\Delta}{\log 3}\, n \log n$.



Proof. For any subset of X, the multicast cost calculated in OPT(S) is $\Delta$ more
than the cost calculated in OPT(X). From Theorem 1 we know the increased
cost is at least $3\Delta \cdot n \log_3 n = \frac{3\Delta}{\log 3}\, n \log n$. □



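Theorem 6. This is a 4.2-approximation algorithm.

Proof.

\[
\begin{aligned}
\mathrm{ALG}(S) &\le \mathrm{ALG}(X) + \mathrm{ALG}(Y) + \frac{4\Delta}{3\log(3/2)}\, n \log n + n\,(M(X) + M(Y)) \\
&\le \alpha \cdot \mathrm{OPT}(X) + \beta \cdot |X| M(X) + \alpha \cdot \mathrm{OPT}(Y) + \beta \cdot |Y| M(Y) + \frac{4\Delta}{3\log(3/2)}\, n \log n + n\,(M(X) + M(Y)) \\
&\le \alpha\,[\mathrm{OPT}(X) + \mathrm{OPT}(Y)] + \frac{4\Delta}{3\log(3/2)}\, n \log n + \left(\tfrac{2}{3}\beta + 1\right) n\,(M(X) + M(Y)) \\
&\le \alpha \left[\mathrm{OPT}(S) - \frac{3\Delta}{\log 3}\, n \log n\right] + \frac{4\Delta}{3\log(3/2)}\, n \log n + \left(\tfrac{2}{3}\beta + 1\right) n\,(M(X) + M(Y)) \\
&\le \alpha \left[\mathrm{OPT}(S) - \frac{3\Delta}{\log 3}\, n \log n\right] + \frac{4\Delta}{3\log(3/2)}\, n \log n + \left(\tfrac{2}{3}\beta + 1\right) n\,(M(S) + \Delta) \\
&= \alpha \cdot \mathrm{OPT}(S) + \left(\tfrac{2}{3}\beta + 1\right) n M(S) - \alpha \cdot \frac{3\Delta}{\log 3}\, n \log n + \frac{4\Delta}{3\log(3/2)}\, n \log n \\
&\le \alpha \cdot \mathrm{OPT}(S) + \beta \cdot n M(S)
\end{aligned}
\]

as long as $\alpha \ge 1.2$ and $\beta \ge 3$. This means this is a 4.2-approximation. □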



